Differential Privacy for Social Science Inference

Authors

  • Vito D’Orazio
  • James Honaker
  • Gary King
Abstract

Social scientists often want to analyze data that contains sensitive personal information that must remain private. However, common techniques for data sharing that attempt to preserve privacy either bring great privacy risks or great loss of information. A long literature has shown that anonymization techniques for data releases are generally open to reidentification attacks. Aggregated information can reduce but not prevent this risk, while also reducing the utility of the data to researchers. Even publishing statistical estimates without releasing the data cannot guarantee that no sensitive personal information has been leaked. Differential Privacy, deriving from roots in cryptography, is one formal, mathematical conception of privacy preservation. It brings provable guarantees that any reported result does not reveal information about any single individual. In this paper we detail the construction of a secure curator interface, by which researchers can obtain privatized statistical results for their queries without gaining any access to the underlying raw data. We introduce differential privacy and the construction of differentially private summary statistics. We then present new algorithms for releasing differentially private estimates of causal effects and for generating differentially private covariance matrices from which any least squares regression may be estimated. We demonstrate the application of these methods through our curator interface.

∗ For discussions and comments we thank Natalie Carvalho, Vishesh Karwa, Jack Murtagh, Kobbi Nissim, Or Sheffet, Adam Smith, Salil Vadhan, and numerous other members of the “Privacy Tools for Sharing Research Data” project, http://privacytools.seas.harvard.edu. This work was supported by the NSF (CNS-1237235), the Alfred P. Sloan Foundation, and a Google gift.
† Assistant Professor in the School of Economic, Political, and Policy Sciences at the University of Texas at Dallas.
‡ Senior Research Scientist, Institute for Quantitative Social Science, 1737 Cambridge Street, Cambridge, MA 02138 ([email protected], http://hona.kr)
§ Albert J. Weatherhead III University Professor, Harvard University, Institute for Quantitative Social Science, 1737 Cambridge Street, Cambridge, MA 02138 ([email protected], http://GaryKing.org)
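The paper's own algorithms are not reproduced on this page. As a rough illustration of the covariance-matrix idea described in the abstract, the Python sketch below assumes data whose entries have been pre-clipped to [-1, 1] and uses the basic Laplace mechanism; the function names and the specific noise calibration are assumptions for illustration, not the authors' implementation.

import numpy as np

def dp_cross_product(Z, epsilon, rng=None):
    # Release an epsilon-differentially private estimate of Z'Z / n,
    # assuming every entry of Z has been pre-clipped to [-1, 1].
    # Changing one row moves each of the d*(d+1)/2 unique entries by at
    # most 2/n, so the L1 sensitivity of the released matrix is d*(d+1)/n.
    rng = np.random.default_rng() if rng is None else rng
    n, d = Z.shape
    M = Z.T @ Z / n
    scale = d * (d + 1) / (n * epsilon)      # Laplace scale = sensitivity / epsilon
    noise = rng.laplace(0.0, scale, size=(d, d))
    upper = np.triu(noise)                   # noise each unique entry once...
    sym_noise = upper + np.triu(noise, 1).T  # ...then mirror to keep the matrix symmetric
    return M + sym_noise

def ols_from_cross_product(M, k):
    # Recover least-squares coefficients for regressing the last column of the
    # original data on its first k columns, using only the released matrix M.
    XtX = M[:k, :k]
    Xty = M[:k, k]
    return np.linalg.solve(XtX, Xty)

# Example usage: Z stacks the clipped regressors X and outcome y as columns.
# Z = np.column_stack([X, y])
# beta_hat = ols_from_cross_product(dp_cross_product(Z, epsilon=1.0), X.shape[1])

Because differential privacy is preserved under post-processing, a noisy cross-product matrix released once in this way can be reused for any regression among its columns without spending additional privacy budget.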

Similar articles

Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models

Motivated by a real-life problem of sharing social network data that contain sensitive personal information, we propose a novel approach to release and analyze synthetic graphs in order to protect the privacy of individual relationships captured by the social network while maintaining the validity of statistical results. Two case studies demonstrate the application and usefulness of the proposed te...


Privacy-Preserving Inference of Social Relationships from Location Data

Social relationships between people, e.g., whether they are friends with each other, can be inferred by observing their behaviors in the real world. Due to the popularity of GPS-enabled mobile devices and online services, a large amount of high-resolution spatiotemporal location data has become available for such inference studies. However, due to the sensitivity of location data and user privacy co...


Dependence Makes You Vulnerable: Differential Privacy Under Dependent Tuples

Differential privacy (DP) is a widely accepted mathematical framework for protecting data privacy. Simply stated, it guarantees that the distribution of query results changes only slightly due to the modification of any one tuple in the database. This allows protection even against powerful adversaries who know the entire database except one tuple. For providing this guarantee, differential p...
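Stated formally (this is the standard definition of ε-differential privacy, added here for reference rather than taken from the snippet above): a randomized mechanism M is ε-differentially private if, for all databases D and D' differing in a single tuple and all sets S of possible outputs,

\[
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S].
\]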


Differentially Private Local Electricity Markets

Privacy-preserving electricity markets have a key role in steering customers towards participation in local electricity markets by guaranteeing the protection of their sensitive information. Moreover, these markets make it possible to statically release and share the market outputs for social good. This paper aims to design a market for local energy communities by implementing Differential Privacy (DP)...




Journal:

Volume   Issue

Pages  -

Publication date: 2015